Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

نویسندگان

Tim Salimans

Diederik P. Kingma

چکیده

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Per-weight Class-based Learning Rates via Analytical Continuation

We study the problem of training deep fully connected neural networks. Despite much progress in the design of activation functions, novel normalization techniques, and various skip-connection techniques, such networks remain challenging to train due to vanishing or exploding gradients. Our method is based on employing a different class-dependent learning rate to each network weight. Since the l...

متن کامل

Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification

Batch normalization (BN) has become a de facto standard for training deep convolutional networks. However, BN accounts for a significant fraction of training run-time and is difficult to accelerate, since it is memory-bandwidth bounded. Such a drawback of BN motivates us to explore recently proposed weight normalization algorithms (WN algorithms), i.e. weight normalization, normalization propag...

متن کامل

Understanding symmetries in deep networks

Recent works have highlighted scale invariance or symmetry present in the weight space of a typical deep network and the adverse effect it has on the Euclidean gradient based stochastic gradient descent optimization. In this work, we show that a commonly used deep network, which uses convolution, batch normalization, reLU, max-pooling, and sub-sampling pipeline, possess more complex forms of sy...

متن کامل

Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate the hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in the hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...

متن کامل

Cystoscopy Image Classication Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

نویسندگان

چکیده

منابع مشابه

Per-weight Class-based Learning Rates via Analytical Continuation

Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification

Understanding symmetries in deep networks

Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Cystoscopy Image Classication Using Deep Convolutional Neural Networks

عنوان ژورنال:

اشتراک گذاری